49 research outputs found
I/O-Efficient Dynamic Planar Range Skyline Queries
We present the first fully dynamic worst case I/O-efficient data structures
that support planar orthogonal 3-sided range skyline reporting queries
in O(\log_{2B^\epsilon} n + t/B^{1-\epsilon}) I/Os and updates
in O(\log_{2B^\epsilon} n) I/Os, using O(n/B^{1-\epsilon}) blocks of space,
for n input planar points, t reported points, and parameter
0 \le \epsilon \le 1. We obtain the result
by extending Sundar's priority queues with attrition to support the operations
DeleteMin and CatenateAndAttrite in O(1) worst case
I/Os, and in O(1/B) amortized I/Os given that a constant number of blocks
is already loaded in main memory. Finally, we show that any pointer-based
static data structure that supports dominated maxima reporting
queries, namely the difficult special case of 4-sided skyline queries, in
O(\log^{O(1)} n + t) worst case time must occupy space, by adapting a similar lower bounding argument for
planar 4-sided range reporting queries.
Comment: Submitted to SODA 201
Longest Common Subsequence on Weighted Sequences
We consider the general problem of the Longest Common Subsequence (LCS) on weighted sequences. Weighted sequences are an extension of classical strings, where in each position every letter of the alphabet may occur with some probability. Previous results presented a PTAS and observed that no FPTAS is possible unless P=NP. In this paper we essentially close the gap between upper and lower bounds by improving both. First, we provide an EPTAS for bounded alphabets (which is the most natural case), and prove that no EPTAS exists for unbounded alphabets unless FPT=W[1]. Furthermore, under the Exponential Time Hypothesis, we provide a lower bound which shows that no significantly better PTAS can exist for unbounded alphabets. As a side note, we prove that it is sufficient to work with only one threshold in the general variant of the problem.
A Space-Optimal Hidden Surface Removal Algorithm for Iso-Oriented Rectangles
We investigate the problem of finding the visible pieces of a scene of
objects from a specified viewpoint. In particular, we are interested in the
design of an efficient hidden surface removal algorithm for a scene comprised
of iso-oriented rectangles. We propose an algorithm that, given a set of
iso-oriented rectangles, reports all visible surfaces in time
and linear space, where is the number of surfaces reported. The previous
best result, by Bern, has the same time complexity but uses space
I/O-Efficient Planar Range Skyline and Attrition Priority Queues
In the planar range skyline reporting problem, we store a set P of n 2D
points in a structure such that, given a query rectangle Q = [a_1, a_2] x [b_1,
b_2], the maxima (a.k.a. skyline) of P \cap Q can be reported efficiently. The
query is 3-sided if an edge of Q is grounded, giving rise to two variants:
top-open (b_2 = \infty) and left-open (a_1 = -\infty) queries.
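The query definition above can be illustrated with a brute-force, in-memory sketch. It is not the I/O-efficient structure of the paper: it simply filters P to the rectangle Q and sweeps from right to left to keep the maxima. The function name `range_skyline` and the tie-handling convention (a point is dropped if another point is at least as large in both coordinates) are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def range_skyline(points: List[Point], a1: float, a2: float,
                  b1: float, b2: float) -> List[Point]:
    """Return the maxima (skyline) of the points inside [a1, a2] x [b1, b2].

    Brute-force illustration: O(n log n) time, no external-memory guarantees.
    """
    # Keep only the points that fall inside the query rectangle Q.
    inside = [(x, y) for (x, y) in points if a1 <= x <= a2 and b1 <= y <= b2]
    # Sweep from the largest x to the smallest; on equal x, larger y first.
    inside.sort(key=lambda p: (-p[0], -p[1]))
    skyline: List[Point] = []
    best_y = float("-inf")
    for x, y in inside:
        # A point is maximal iff its y exceeds every y seen further right.
        if y > best_y:
            skyline.append((x, y))
            best_y = y
    return skyline


pts = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6), (5, 0.5)]
four_sided = range_skyline(pts, 1, 4, 0, 10)
# Top-open query: b2 is infinite.
top_open = range_skyline(pts, 1, 4, 2, float("inf"))
# Left-open query: a1 is minus infinity.
left_open = range_skyline(pts, float("-inf"), 3, 0, 10)
```

Passing infinite bounds for b2 or a1 gives the top-open and left-open variants, respectively; the sketch treats both identically, whereas the paper shows that they differ in difficulty in external memory.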
All our results are in external memory under the O(n/B) space budget, for
both the static and dynamic settings:
* For static P, we give structures that answer top-open queries in O(log_B n
+ k/B), O(loglog_B U + k/B), and O(1 + k/B) I/Os when the universe is R^2, a U
x U grid, and a rank space grid [O(n)]^2, respectively (where k is the number
of reported points). The query complexity is optimal in all cases.
* We show that the left-open case is harder, such that any linear-size
structure must incur \Omega((n/B)^e + k/B) I/Os for a query. We show that this
case is as difficult as the general 4-sided queries, for which we give a static
structure with the optimal query cost O((n/B)^e + k/B).
* We give a dynamic structure that supports top-open queries in O(log_{2B^e}
(n/B) + k/B^{1-e}) I/Os, and updates in O(log_{2B^e} (n/B)) I/Os, for any e
satisfying 0 \le e \le 1. This leads to a dynamic structure for 4-sided queries
with optimal query cost O((n/B)^e + k/B), and amortized update cost O(log
(n/B)).
As a contribution of independent interest, we propose an I/O-efficient
version of the fundamental structure priority queue with attrition (PQA). Our
PQA supports FindMin, DeleteMin, and InsertAndAttrite all in O(1) worst case
I/Os, and O(1/B) amortized I/Os per operation.
We also add the new CatenateAndAttrite operation that catenates two PQAs in
O(1) worst case and O(1/B) amortized I/Os. This operation is a non-trivial
extension to the classic PQA of Sundar, even in internal memory.
Comment: Appeared at PODS 2013, New York, 19 pages, 10 figures. arXiv admin
note: text overlap with arXiv:1208.4511, arXiv:1207.234
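To make the PQA operations named in the abstract above concrete, here is a minimal in-memory sketch of their semantics. It assumes the usual definition of attrition (inserting x discards every stored element greater than or equal to x) and makes no attempt to reach the worst-case or I/O bounds claimed in the paper; the class name `SimplePQA` and the method names are hypothetical.

```python
from collections import deque


class SimplePQA:
    """Naive priority queue with attrition, kept as a strictly increasing deque."""

    def __init__(self, items=()):
        self._items = deque()
        for x in items:
            self.insert_and_attrite(x)

    def find_min(self):
        # The front of the deque is always the minimum.
        return self._items[0]

    def delete_min(self):
        return self._items.popleft()

    def insert_and_attrite(self, x):
        # Attrition: drop every stored element that is >= x, then append x.
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def catenate_and_attrite(self, other):
        # Elements of self that are >= min(other) are attrited, then other
        # is appended. This naive version costs O(len(other)) time.
        for x in list(other._items):
            self.insert_and_attrite(x)
        other._items.clear()

    def to_list(self):
        return list(self._items)


q = SimplePQA([5, 7, 9, 6, 8])   # 6 attrites 7 and 9
r = SimplePQA([7, 10])
q.catenate_and_attrite(r)        # 7 attrites 8
smallest = q.delete_min()
```

In this sketch InsertAndAttrite is O(1) amortized only, and catenation is linear in the second queue, which is exactly the cost the paper's construction avoids.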
Investigation of Database Models for Evolving Graphs
We deal with the efficient implementation of storage models for time-varying graphs. To this end, we present an improved approach for the HiNode vertex-centric model based on MongoDB. This approach, apart from its inherent space optimality, exhibits significant improvements in execution times for global queries, which are the most challenging query type for entity-centric approaches. Compared to an implementation based on Cassandra, not only are significant speedups achieved, but more expensive queries can also be executed, owing to the capability to exploit indices to a larger extent and to benefit from in-database query processing.
Continuous Outlier Mining of Streaming Data in Flink
In this work, we focus on distance-based outliers in a metric space, where
the status of an entity as to whether it is an outlier is based on the number
of other entities in its neighborhood. In recent years, several solutions have
tackled the problem of distance-based outliers in data streams, where outliers
must be mined continuously as new elements become available. An interesting
research problem is to combine the streaming environment with massively
parallel systems to provide scalable stream-based algorithms. However, none of
the previously proposed techniques refer to a massively parallel setting. Our
proposal fills this gap and investigates the challenges in transferring
state-of-the-art techniques to Apache Flink, a modern platform for intensive
streaming analytics. We thoroughly present the technical challenges encountered
and the alternatives that may be applied. We show speed-ups of up to 117 (resp.
2076) times over a naive parallel (resp. non-parallel) solution in Flink, by
using just an ordinary four-core machine and a real-world dataset. When moving
to a three-machine cluster, due to less contention, we manage to achieve both
better scalability in terms of the window slide size and the data
dimensionality, and even higher speed-ups, e.g., by a factor of 510. Overall,
our results demonstrate that outlier mining can be achieved in an efficient and
scalable manner. The resulting techniques have been made publicly available as
open-source software.
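The outlier definition used in the abstract above (an entity is an outlier depending on how many other entities lie in its neighborhood) can be sketched for a single window as follows. This is a naive quadratic check, not the parallel Flink technique of the paper; the parameter names `radius` and `k`, the Euclidean metric, and the function name are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance_outliers(window: Sequence[Point], radius: float, k: int) -> List[Point]:
    """Return the points of one window that have fewer than k neighbours
    within the given radius (Euclidean distance is assumed here)."""
    outliers = []
    for i, p in enumerate(window):
        # Count the other points of the window that lie within the radius.
        neighbours = sum(
            1
            for j, q in enumerate(window)
            if i != j and math.dist(p, q) <= radius
        )
        if neighbours < k:
            outliers.append(p)
    return outliers


window = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
found = distance_outliers(window, radius=1.5, k=2)
```

In a streaming setting this check would have to be re-evaluated as the window slides, which is the cost that continuous algorithms aim to reduce.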
Threshold-Based Network Structural Dynamics
The interest in dynamic processes on networks has been steadily rising in recent
years. In this paper, we consider the -Thresholded Network
Dynamics (-Dynamics), where , in which only
structural dynamics (dynamics of the network) are allowed, guided by local
thresholding rules executed in each node. In particular, in each discrete round
, each pair of nodes and that are allowed to communicate by the
scheduler, computes a value (the potential of the pair) as a
function of the local structure of the network at round around the two
nodes. If then the link (if it exists) between
and is removed; if then an existing
link among and is maintained; if then a
link between and is established if not already present.
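The thresholding rule described above can be sketched as one synchronous round over an undirected graph. Several details are assumptions, since the symbols are missing from this listing: the two thresholds are called `alpha` and `beta` here, the comparisons are taken as "below alpha removes, at least beta establishes", the scheduler is assumed to allow every pair to communicate, and the potential (number of common neighbours) is only a placeholder for whatever local function a protocol would define.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def common_neighbours(adj: Graph, u, v) -> int:
    # Placeholder potential: size of the shared neighbourhood of u and v.
    return len(adj[u] & adj[v])


def threshold_round(adj: Graph, alpha: float, beta: float,
                    potential: Callable[[Graph, Hashable, Hashable], float]
                    = common_neighbours) -> Graph:
    """Apply one round of the thresholding rule to every pair of nodes."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    for u, v in combinations(sorted(adj), 2):
        # The potential is computed on the structure at the start of the round.
        value = potential(adj, u, v)
        if value < alpha:
            # Below the lower threshold: remove the link if it exists.
            new_adj[u].discard(v)
            new_adj[v].discard(u)
        elif value >= beta:
            # At or above the upper threshold: establish the link.
            new_adj[u].add(v)
            new_adj[v].add(u)
        # Otherwise the current state of the link is kept as it is.
    return new_adj


graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
after_wide = threshold_round(graph, alpha=1, beta=2)
after_tight = threshold_round(graph, alpha=1, beta=1)
```

With `alpha=1, beta=2` only the pendant link between nodes 3 and 4 is removed; with `alpha=1, beta=1` node 4 additionally gains links to nodes 1 and 2, each of which shares one neighbour with it.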
The microscopic structure of -Dynamics appears to be simple,
so that we are able to rigorously argue about it, but still flexible, so that
we are able to design meaningful microscopic local rules that give rise to
interesting macroscopic behaviors. Our goals are the following: a) to
investigate the properties of the -Thresholded Network Dynamics
and b) to show that -Dynamics is expressive enough to solve
complex problems on networks.
Our contribution in these directions is twofold. We rigorously exhibit the
claim about the expressiveness of -Dynamics, both by designing
a simple protocol that provably computes the -core of the network as well as
by showing that -Dynamics is in fact Turing-Complete. Second
and most important, we construct general tools for proving stabilization that
work for a subclass of -Dynamics and prove speed of convergence
in a restricted setting.
Comment: 29 pages, extension of the Post-print containing all proofs, to
appear in SIROCCO 202